Beyond SGD: Efficient Learning with Non-i.i.d. Data



Kfir Y. Levy (Technion)

I did my post-doc at the Institute for Machine Learning at ETHZ, working with Prof. Andreas Krause. Previously, I did my PhD at the IE&M Department of the Technion under the guidance of Prof. Elad Hazan. Before that, I completed my master's at the EE Department of the Technion under the guidance of Prof. Nahum Shimkin.



Short Abstract: The tremendous success of the Machine Learning paradigm relies heavily on the development of powerful optimization methods. The canonical algorithm for training learning models is SGD (Stochastic Gradient Descent), yet this method has several limitations. In particular, it relies on the assumption that data points are i.i.d. (independent and identically distributed); however, this assumption does not necessarily hold in practice. In this talk I will discuss an ongoing line of research in which we develop alternative methods that resolve this limitation of SGD in two different contexts. In the first part of the talk I will describe a method that copes well with contaminated data. In the second part, I will discuss a method that enables efficient handling of Markovian data. The methods I describe are as efficient as SGD, and implicitly adapt to the underlying structure of the problem in a data-dependent manner.
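
To make the i.i.d. assumption concrete, below is a minimal, generic SGD sketch in Python. It is an illustration of the standard algorithm discussed in the abstract, not the alternative methods presented in the talk; the function names and the toy mean-estimation objective are assumptions made purely for demonstration.

```python
import numpy as np

def sgd(grad, x0, sample, lr=0.1, n_steps=1000):
    """Plain SGD: at each step, draw a fresh data point and take a
    gradient step. The standard analysis hinges on the samples being
    i.i.d., so that each stochastic gradient is an unbiased estimate
    of the true gradient."""
    x = x0
    for _ in range(n_steps):
        z = sample()              # assumed: an independent, identically distributed draw
        x = x - lr * grad(x, z)   # gradient step on the sampled point
    return x

# Toy usage (illustrative): estimate the mean of a distribution by
# minimizing f(x) = E[(x - z)^2 / 2], whose stochastic gradient is (x - z).
rng = np.random.default_rng(0)
x_hat = sgd(grad=lambda x, z: x - z,
            x0=0.0,
            sample=lambda: rng.normal(loc=3.0, scale=1.0))
print(x_hat)  # close to 3.0 when the samples are i.i.d.
```

When the sampled points are contaminated or come from a Markov chain rather than independent draws, the unbiasedness underlying this loop breaks down, which is the gap the methods in the talk aim to address.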